feat: QUIC agent tunnel — protocol, listener, agent client#1738

Open
irvingouj@Devolutions (irvingoujAtDevolution) wants to merge 14 commits into master from feat/quic-tunnel-1-core

@irvingoujAtDevolution irvingouj@Devolutions (irvingoujAtDevolution) commented Apr 2, 2026

Summary

QUIC-based agent tunnel (PR 1 of 4). Agents in private networks connect outbound to Gateway via QUIC/mTLS, advertise reachable subnets and domains, and proxy TCP connections. Pure Rust (Quinn + rustls), zero C dependencies.

See Technical Spec for protocol details.

PR stack

  1. Protocol + Tunnel Core (this PR)
  2. Transparent Routing
  3. Auth + Webapp
  4. Deployment + Installer

Highlights

  • Quinn QUIC transport with mTLS (private PKI)
  • CSR-based enrollment (private key never leaves agent)
  • Auto-reconnect with exponential backoff
  • AD domain auto-detection
  • Bounded deserialization, buffer limits, connection limits
  • 32 + 15 tests

🤖 Generated with Claude Code

@irvingoujAtDevolution
Contributor Author

QUIC Agent Tunnel — Technical Specification

1. Enrollment

How an agent gets its certificate

Admin                        Agent Machine                  Gateway
  │                              │                             │
  │  Click "Enroll Agent"        │                             │
  │  in DVLS / Gateway webapp    │                             │
  │                              │                             │
  │  Copy enrollment string      │                             │
  │  dgw-enroll:v1:base64...    │                             │
  │                              │                             │
  │  Paste into MSI installer    │                             │
  │  or CLI command              │                             │
  │                              │                             │
  │                              │  1. Decode enrollment string │
  │                              │     → gateway URL            │
  │                              │     → one-time token         │
  │                              │     → QUIC endpoint          │
  │                              │                             │
  │                              │  2. Generate ECDSA P-256     │
  │                              │     key pair LOCALLY         │
  │                              │     Write key to disk (0600) │
  │                              │                             │
  │                              │  3. Generate CSR             │
  │                              │     (public key + signature) │
  │                              │                             │
  │                              │── POST /enroll ────────────>│
  │                              │   { agent_name, csr_pem }   │
  │                              │   Bearer: <one-time-token>  │
  │                              │                             │
  │                              │                             │  4. Validate token
  │                              │                             │     (consumed, cannot replay)
  │                              │                             │  5. Verify CSR signature
  │                              │                             │  6. Assign agent UUID
  │                              │                             │  7. Sign cert with CA
  │                              │                             │     (embed UUID in SAN)
  │                              │                             │
  │                              │<────────────────────────────│
  │                              │   { agent_id, cert_pem,     │
  │                              │     ca_cert_pem, endpoint } │
  │                              │                             │
  │                              │  8. Write cert + CA cert     │
  │                              │  9. Update agent.json        │
  │                              │     (Tunnel section only,    │
  │                              │      preserves other config) │
  │                              │                             │
  │                              │  10. Connect via QUIC ──────>│
  │                              │      (mTLS with new cert)    │

Key property: the private key never leaves the agent machine.
Only the CSR (containing the public key and a proof-of-possession signature) is transmitted.
The enrollment response contains only the signed certificate and the CA certificate — no secrets.

Enrollment token

The enrollment token is either:

  • A one-time UUID (122-bit entropy) generated by the gateway — consumed atomically on use, cannot be replayed.
  • A static secret from gateway configuration — compared in constant time.
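
The two token modes can be sketched with the standard library alone. The names below (`EnrollmentTokenStore`, `insert`, `redeem`, `constant_time_eq` as written here) are illustrative assumptions, not the PR's actual code; a production comparison would more likely rely on a vetted constant-time crate.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical in-memory store for one-time enrollment tokens.
pub struct EnrollmentTokenStore {
    // Maps each token to its expiry instant.
    tokens: Mutex<HashMap<String, Instant>>,
}

impl EnrollmentTokenStore {
    pub fn new() -> Self {
        Self { tokens: Mutex::new(HashMap::new()) }
    }

    /// Registers a token that stays valid for `ttl`.
    pub fn insert(&self, token: &str, ttl: Duration) {
        let expires_at = Instant::now() + ttl;
        self.tokens.lock().unwrap().insert(token.to_owned(), expires_at);
    }

    /// Removes the token while holding the lock, so two concurrent
    /// redemptions of the same token cannot both succeed.
    #[must_use]
    pub fn redeem(&self, token: &str) -> bool {
        match self.tokens.lock().unwrap().remove(token) {
            Some(expires_at) => Instant::now() < expires_at,
            None => false,
        }
    }
}

/// Compares two byte strings without exiting at the first differing byte.
/// The length check is not hidden, and the compiler gives no hard timing
/// guarantee, so treat this as an illustration only.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}
```

Removing the entry inside `redeem` is what makes the token single-use: a replayed request finds nothing left to remove.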

2. Stream Multiplexing

One QUIC connection, many independent streams

Agent ←──── single QUIC connection ────→ Gateway
             │
             ├── Stream 0 (control, always open)
             │   Agent → GW:  RouteAdvertise every 30s
             │   Agent → GW:  Heartbeat every 60s
             │   GW → Agent:  HeartbeatAck
             │
             ├── Stream 1 (RDP session #1)
             │   GW → Agent:  ConnectMessage { target: 10.0.0.5:3389 }
             │   Agent → GW:  ConnectResponse::Success
             │   Then: raw bidirectional bytes (RDP protocol data)
             │
             ├── Stream 5 (SSH session #1)
             │   GW → Agent:  ConnectMessage { target: 10.0.0.10:22 }
             │   Agent → GW:  ConnectResponse::Success
             │   Then: raw bidirectional bytes (SSH protocol data)
             │
             └── Stream 9 (SSH session #2)
                 GW → Agent:  ConnectMessage { target: 10.0.0.20:22 }
                 Agent → GW:  ConnectResponse::Success
                 Then: raw bidirectional bytes (SSH protocol data)

Each stream is independently ordered.
A retransmission on stream 1 does not block streams 5 or 9.
This is QUIC's core advantage over TCP — no head-of-line blocking across streams.
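
The stream numbers in the diagram follow the QUIC stream ID layout (RFC 9000, section 2.1): the lowest bit identifies the initiator and the second-lowest bit the direction. A small sketch, assuming the agent is the QUIC client and the gateway the QUIC server; the type and function names are illustrative:

```rust
/// Which endpoint opened the stream (bit 0 of the stream ID).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Initiator {
    Client,
    Server,
}

/// Whether data flows both ways or one way (bit 1 of the stream ID).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction {
    Bidirectional,
    Unidirectional,
}

/// Builds the stream ID for the `index`-th stream of a given kind.
pub fn stream_id(index: u64, initiator: Initiator, direction: Direction) -> u64 {
    let initiator_bit = match initiator {
        Initiator::Client => 0,
        Initiator::Server => 1,
    };
    let direction_bit = match direction {
        Direction::Bidirectional => 0,
        Direction::Unidirectional => 2,
    };
    (index << 2) | direction_bit | initiator_bit
}

/// Splits a stream ID back into its index, initiator, and direction.
pub fn classify(id: u64) -> (u64, Initiator, Direction) {
    let initiator = if id & 1 == 0 { Initiator::Client } else { Initiator::Server };
    let direction = if id & 2 == 0 {
        Direction::Bidirectional
    } else {
        Direction::Unidirectional
    };
    (id >> 2, initiator, direction)
}
```

Under that assumption, stream 0 is the first client-initiated bidirectional stream (the control stream), and 1, 5, 9 are the first three server-initiated bidirectional streams.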

How a new session is established

  1. Gateway allocates the next server-initiated stream ID (1, 5, 9, 13, …).
  2. Gateway writes a length-prefixed ConnectMessage to the new stream.
  3. Agent reads the stream, decodes the ConnectMessage.
  4. Agent validates the target IP is within its advertised subnets (security boundary).
  5. Agent opens a TCP connection to the target.
  6. Agent writes ConnectResponse::Success back on the same stream.
  7. From this point, every byte on the QUIC stream is forwarded 1:1 to/from the TCP connection.

No new QUIC handshake is needed — streams are opened instantly on the existing connection.
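
Step 4 (the subnet check) is the security boundary, so it is worth sketching. The following is a minimal IPv4-only illustration using the standard library; `Ipv4Subnet` and `is_target_allowed` are hypothetical names, and the PR may well use a dedicated CIDR crate instead.

```rust
use std::net::Ipv4Addr;

/// An IPv4 network in CIDR form, e.g. 10.10.0.0/24.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ipv4Subnet {
    network: u32,
    prefix_len: u8,
}

impl Ipv4Subnet {
    /// Parses "a.b.c.d/len"; returns None for malformed input.
    pub fn parse(cidr: &str) -> Option<Self> {
        let (addr, len) = cidr.split_once('/')?;
        let addr: Ipv4Addr = addr.parse().ok()?;
        let prefix_len: u8 = len.parse().ok()?;
        if prefix_len > 32 {
            return None;
        }
        // Host bits are cleared so "10.10.0.8/24" and "10.10.0.0/24" compare equal.
        let network = u32::from(addr) & Self::mask(prefix_len);
        Some(Self { network, prefix_len })
    }

    fn mask(prefix_len: u8) -> u32 {
        if prefix_len == 0 {
            0
        } else {
            u32::MAX << (32 - u32::from(prefix_len))
        }
    }

    /// True when `ip` falls inside this subnet.
    pub fn contains(&self, ip: Ipv4Addr) -> bool {
        (u32::from(ip) & Self::mask(self.prefix_len)) == self.network
    }
}

/// The agent-side check: refuse any target outside the advertised subnets.
pub fn is_target_allowed(advertised: &[Ipv4Subnet], target: Ipv4Addr) -> bool {
    advertised.iter().any(|subnet| subnet.contains(target))
}
```

With an advertised subnet of 10.10.0.0/24, a request for 10.10.0.3 passes and a request for 10.10.1.3 is rejected before any TCP connection is attempted.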

Message encoding

All control and session setup messages use length-prefixed bincode:

┌─────────────────────────┬──────────────────────────────┐
│ 4 bytes (big-endian u32)│ N bytes (bincode payload)    │
│ message_length = N      │                              │
└─────────────────────────┴──────────────────────────────┘

After ConnectResponse::Success, the stream carries raw bytes — no framing, no headers.
The gateway and agent act as transparent TCP proxies.

Size limits

| Message type | Max size | Purpose |
| --- | --- | --- |
| Control messages | 1 MiB | RouteAdvertise, Heartbeat |
| Session messages | 64 KiB | ConnectMessage, ConnectResponse |

Limits are enforced on the length prefix (before reading the payload) and on the bincode deserializer (prevents crafted payloads with huge internal Vec lengths).
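
The framing and the first of the two limits (the length-prefix check) can be illustrated as follows. This is a synchronous, standard-library sketch that treats the payload as opaque bytes; the actual implementation is presumably async and bincode-based, and the constant and function names here are assumptions.

```rust
use std::io::{self, Read, Write};

/// Limits taken from the size table above.
pub const MAX_CONTROL_MESSAGE_SIZE: u32 = 1024 * 1024;
pub const MAX_SESSION_MESSAGE_SIZE: u32 = 64 * 1024;

/// Writes a 4-byte big-endian length followed by the payload.
pub fn write_frame<W: Write>(writer: &mut W, payload: &[u8], max_size: u32) -> io::Result<()> {
    let len = u32::try_from(payload.len())
        .ok()
        .filter(|len| *len <= max_size)
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "payload exceeds frame limit"))?;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)
}

/// Reads one frame, rejecting an oversized length prefix before any
/// payload buffer is allocated.
pub fn read_frame<R: Read>(reader: &mut R, max_size: u32) -> io::Result<Vec<u8>> {
    let mut prefix = [0u8; 4];
    reader.read_exact(&mut prefix)?;
    let len = u32::from_be_bytes(prefix);
    if len > max_size {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds limit"));
    }
    let mut payload = vec![0u8; len as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the prefix first means a peer that claims a 4 GiB message is rejected after four bytes, without the receiver reserving memory for it. The second limit, on the deserializer itself, is not shown here.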

3. User Experience

Network topology

┌─────────────────────────────────────────────────────────┐
│  Cloud                                                   │
│  ┌──────────────────┐                                   │
│  │ Devolutions      │                                   │
│  │ Gateway          │  ← publicly reachable              │
│  │ gateway.acme.com │                                   │
│  └────────┬─────────┘                                   │
│           │ QUIC (UDP 4433)                              │
└───────────┼─────────────────────────────────────────────┘
            │
       ─ ─ ─│─ ─ ─ ─ ─ ─ firewall (outbound only) ─ ─ ─ ─
            │
┌───────────┼─────────────────────────────────────────────┐
│  Office   │                                              │
│  ┌────────┴─────────┐    ┌──────────┐  ┌──────────┐    │
│  │ Agent            │    │ DC       │  │ File     │    │
│  │ 10.10.0.8        │───→│ 10.10.0.3│  │ Server   │    │
│  │ advertises:      │    │ (RDP+KDC)│  │ 10.10.0.5│    │
│  │  10.10.0.0/24    │    └──────────┘  └──────────┘    │
│  │  contoso.local   │                                   │
│  └──────────────────┘                                   │
└─────────────────────────────────────────────────────────┘

Admin setup (one-time)

  1. Open Gateway webapp → Agents → Enroll Agent.
  2. Copy the enrollment string.
  3. On the agent machine: devolutions-agent up --enrollment-string "dgw-enroll:v1:...".
  4. Agent enrolls, connects, starts advertising 10.10.0.0/24 + contoso.local.

End-user workflow (daily use)

The user has no awareness of the agent. From their perspective:

  1. Open RDM or Gateway webapp.
  2. Create an RDP connection to 10.10.0.3.
  3. Click connect.
  4. The RDP desktop appears.

What happens behind the scenes:

User's browser
  → WebSocket to Gateway (gateway.acme.com)
    → Gateway routing: 10.10.0.3 matches agent's 10.10.0.0/24 subnet
      → Gateway opens QUIC stream 5 to agent
        → ConnectMessage { target: "10.10.0.3:3389" }
          → Agent connects TCP to 10.10.0.3:3389
            → ConnectResponse::Success
              → RDP data flows bidirectionally

No VPN. No inbound firewall rules on the office network. No routing configuration.

Transparent routing rules

When a connection request arrives, the gateway evaluates routing in priority order:

  1. Explicit agent ID — if the session token contains jet_agent_id, route to that specific agent.
  2. IP subnet match — if the target is an IP address, find agents whose advertised subnets contain it.
  3. Domain suffix match — if the target is a hostname, find agents whose advertised domains match by longest suffix (e.g., db01.finance.contoso.local matches finance.contoso.local over contoso.local).
  4. No match — direct connection (gateway connects to the target itself, no tunnel).

When multiple agents match the same target, the most recently seen agent is tried first.
If it fails, the next candidate is tried (automatic failover).
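
Rule 3 combined with the failover ordering can be sketched as below. The struct and function names are hypothetical, and ordering candidates by suffix length first and recency second is one reading of the rules above rather than confirmed behavior.

```rust
/// Hypothetical view of one registered agent for hostname routing.
#[derive(Debug, Clone)]
pub struct AgentRoute {
    pub agent_id: String,
    pub domains: Vec<String>,
    pub last_seen_ms: u64,
}

/// Length of the longest advertised domain that equals `host` or is a
/// dot-separated suffix of it; None when nothing matches.
fn best_suffix_len(host: &str, domains: &[String]) -> Option<usize> {
    let host = host.to_ascii_lowercase();
    domains
        .iter()
        .map(|domain| domain.to_ascii_lowercase())
        .filter(|domain| host == *domain || host.ends_with(format!(".{}", domain).as_str()))
        .map(|domain| domain.len())
        .max()
}

/// Returns matching agents in the order they would be tried:
/// longest suffix first, then most recently seen.
pub fn candidates_for_host<'a>(host: &str, agents: &'a [AgentRoute]) -> Vec<&'a AgentRoute> {
    let mut matches: Vec<(usize, &AgentRoute)> = agents
        .iter()
        .filter_map(|agent| best_suffix_len(host, &agent.domains).map(|len| (len, agent)))
        .collect();
    matches.sort_by(|(len_a, a), (len_b, b)| {
        len_b.cmp(len_a).then(b.last_seen_ms.cmp(&a.last_seen_ms))
    });
    matches.into_iter().map(|(_, agent)| agent).collect()
}
```

Requiring a leading dot in the suffix test keeps `notcontoso.local` from matching `contoso.local`. An empty result corresponds to rule 4, the direct connection.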

Resilience

  • Agent auto-reconnects if the QUIC connection drops (exponential backoff, 1s–60s, with jitter).
  • Config re-read on every reconnection attempt (admin can change subnets without restarting the service).
  • Heartbeat monitoring — agents are marked offline after 90 seconds without a heartbeat.
  • Graceful shutdown — agent sends QUIC close frame, gateway immediately unregisters it from routing.
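
The timing rules above can be sketched with `std::time::Duration`. The 1s, 60s, and 90s values come from the list; the jitter formula (up to 50% added, then capped) and all names are assumptions, and a later commit in this PR notes that the reconnect loop uses the backoff crate rather than hand-written code.

```rust
use std::time::Duration;

pub const INITIAL_DELAY: Duration = Duration::from_secs(1);
pub const MAX_DELAY: Duration = Duration::from_secs(60);
pub const HEARTBEAT_TIMEOUT: Duration = Duration::from_secs(90);

/// Delay before reconnect attempt number `attempt` (zero-based).
/// `jitter` is a value in [0.0, 1.0] supplied by the caller, for example
/// from a random number generator; it adds up to 50% to the base delay.
pub fn backoff_delay(attempt: u32, jitter: f64) -> Duration {
    // Doubles each attempt: 1s, 2s, 4s, ... capped at MAX_DELAY.
    let factor = 2u32.saturating_pow(attempt);
    let base = INITIAL_DELAY.saturating_mul(factor).min(MAX_DELAY);
    base.mul_f64(1.0 + 0.5 * jitter.clamp(0.0, 1.0)).min(MAX_DELAY)
}

/// Gateway-side liveness rule: an agent counts as online only while its
/// last heartbeat is less than 90 seconds old.
pub fn is_online(since_last_heartbeat: Duration) -> bool {
    since_last_heartbeat < HEARTBEAT_TIMEOUT
}
```

Passing the jitter in as a parameter keeps the function deterministic and easy to test; the caller decides where the randomness comes from.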

Copilot AI left a comment

Pull request overview

Adds the first slice of a QUIC/mTLS “agent tunnel” system: a shared binary protocol crate, a Gateway-side QUIC listener/registry/enrollment API, and an Agent-side enrollment + reconnecting tunnel client. This enables routing Gateway-initiated TCP proxy sessions through outbound-connected agents (for private-network reachability).

Changes:

  • Introduces agent-tunnel-proto crate (control/session messages, framing, protocol versioning).
  • Adds Gateway agent-tunnel core (agent_tunnel module), config wiring, REST endpoints, and token claim support (jet_agent_id) used in the forwarding path.
  • Adds Agent enrollment/bootstrap + QUIC tunnel client with auto-reconnect and domain auto-detection.

Reviewed changes

Copilot reviewed 35 out of 36 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| devolutions-gateway/tests/config.rs | Updates config samples to include agent_tunnel field. |
| devolutions-gateway/src/token.rs | Adds jet_agent_id to association claims; adjusts scope token claims serialization/visibility. |
| devolutions-gateway/src/service.rs | Initializes and registers the agent-tunnel listener task when enabled. |
| devolutions-gateway/src/ngrok.rs | Threads agent_tunnel_handle into the TCP tunnel client path. |
| devolutions-gateway/src/middleware/auth.rs | Adds auth exception for /jet/agent-tunnel/enroll (self-auth via bearer token). |
| devolutions-gateway/src/listener.rs | Threads agent_tunnel_handle into the generic client path. |
| devolutions-gateway/src/lib.rs | Exposes agent_tunnel module and adds agent_tunnel_handle to DgwState. |
| devolutions-gateway/src/generic_client.rs | Uses jet_agent_id to route Fwd connections through the agent tunnel. |
| devolutions-gateway/src/extract.rs | Adds request extractors for agent-management read/write access control. |
| devolutions-gateway/src/config.rs | Adds AgentTunnelConf to Gateway config DTO and runtime config. |
| devolutions-gateway/src/api/webapp.rs | Ensures new jet_agent_id claim is present (set to None) when minting tokens. |
| devolutions-gateway/src/api/mod.rs | Nests the new /jet/agent-tunnel/* router. |
| devolutions-gateway/src/api/agent_enrollment.rs | Implements enrollment + agent management endpoints (list/get/delete/resolve-target). |
| devolutions-gateway/src/agent_tunnel/mod.rs | Declares agent-tunnel submodules and re-exports core types. |
| devolutions-gateway/src/agent_tunnel/listener.rs | QUIC UDP listener event loop + proxy-stream request dispatching. |
| devolutions-gateway/src/agent_tunnel/enrollment_store.rs | In-memory single-use enrollment token store with expiry. |
| devolutions-gateway/src/agent_tunnel/stream.rs | Tokio AsyncRead/AsyncWrite wrapper over QUIC streams via channels. |
| devolutions-gateway/src/agent_tunnel/registry.rs | Agent registry with heartbeat liveness + subnet/domain routing selection. |
| devolutions-gateway/src/agent_tunnel/connection.rs | Managed quiche connection: handshake identity, control parsing, proxy stream setup. |
| devolutions-gateway/src/agent_tunnel/cert.rs | CA manager for enrollment signing + server cert issuance and cert parsing helpers. |
| devolutions-gateway/Cargo.toml | Adds QUIC/proto/cert/routing dependencies for the tunnel feature. |
| devolutions-agent/src/service.rs | Registers TunnelTask when tunnel is enabled; fixes conf_handle cloning for RDP task. |
| devolutions-agent/src/main.rs | Adds CLI support for enroll/up bootstrap flows and parsing helpers + tests. |
| devolutions-agent/src/lib.rs | Exposes new modules: tunnel, enrollment, domain_detect. |
| devolutions-agent/src/enrollment.rs | Implements enrollment request + persistence of certs/config merge. |
| devolutions-agent/src/domain_detect.rs | Adds Windows/Linux DNS domain auto-detection helper. |
| devolutions-agent/src/tunnel.rs | Implements reconnecting QUIC client + control/session stream handling and TCP proxying. |
| devolutions-agent/src/config.rs | Adds tunnel config section; makes save_config/get_conf_file_path public. |
| devolutions-agent/Cargo.toml | Adds proto/quiche/reqwest/rcgen dependencies and Windows feature for domain detection. |
| crates/agent-tunnel-proto/src/lib.rs | Defines the protocol crate API surface and exports. |
| crates/agent-tunnel-proto/src/version.rs | Adds protocol version constants + validation helper. |
| crates/agent-tunnel-proto/src/error.rs | Defines protocol-level error types. |
| crates/agent-tunnel-proto/src/control.rs | Adds control-plane message definitions + framed encode/decode. |
| crates/agent-tunnel-proto/src/session.rs | Adds session-plane message definitions + framed encode/decode. |
| crates/agent-tunnel-proto/Cargo.toml | New crate manifest and dependencies. |
| Cargo.lock | Locks new dependencies introduced for QUIC, cert handling, registry, and protocol crate. |

Add QUIC-based agent tunnel core infrastructure. Agents in private
networks connect outbound to Gateway via QUIC/mTLS, advertise reachable
subnets and domains, and proxy TCP connections on behalf of Gateway.

Protocol (agent-tunnel-proto crate):
- RouteAdvertise with subnets + domain advertisements
- ConnectMessage/ConnectResponse for session stream setup
- Heartbeat/HeartbeatAck for liveness detection
- Protocol version negotiation (v2)

Gateway (agent_tunnel module):
- QUIC listener with mTLS authentication
- Agent registry with subnet/domain tracking
- Certificate authority for agent enrollment
- Enrollment token store (one-time tokens)
- Bidirectional proxy stream multiplexing

Agent (devolutions-agent):
- QUIC client with auto-reconnect and exponential backoff
- Agent enrollment with config merge (preserves existing settings)
- Domain auto-detection (Windows: USERDNSDOMAIN, Linux: resolv.conf)
- Subnet validation on incoming connections
- Certificate file permissions (0o600 on Unix)

API endpoints:
- POST /jet/agent-tunnel/enroll — agent enrollment
- GET /jet/agent-tunnel/agents — list agents
- GET /jet/agent-tunnel/agents/{id} — get agent
- DELETE /jet/agent-tunnel/agents/{id} — delete agent
- POST /jet/agent-tunnel/agents/resolve-target — routing diagnostics

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- ConnectMessage → ConnectRequest (precise naming)
- Move encode/decode into ControlStream/SessionStream wrappers
  (actor-on-object: ctrl.send(&msg) instead of msg.encode(&mut stream))
- ControlStream.into_split() → ControlSendStream + ControlRecvStream
  (compile-time separation, no phantom halves)
- From<(S, R)> for stream wrappers (connection.open_bi().await?.into())
- Rename spawned tasks: run_control_reader, run_session_proxy,
  run_agent_connection, run_control_loop
- Spawned tasks own args and handle errors internally
- Collect JoinHandles, abort all on shutdown
- Extract helpers to tunnel_helpers.rs
- Document backoff strategy with examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- CaManager::load_or_generate returns Arc<Self> directly
- Rename enrollment token consume → redeem
- Remove unused resolve-target API endpoint + helpers + tests
- Remove routing methods from registry (PR2 scope)
- Remove Option from RouteAdvertisementState (empty = no routes)
- Target enum for typed IP vs domain parsing
- Prefix variables clearly (server_cert_*, ca_*)
- Add TODO for traffic audit and Windows DACL
- Backoff strategy documented with examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Address review feedback from Benoit and Marc-André:

- Rename HTTP mountpoint /jet/agent-tunnel → /jet/tunnel
- Replace SkipHostnameVerification with SpkiPinnedVerifier that
  performs full chain + hostname + SPKI pin validation
- Enrollment response now includes server_spki_sha256 for pinning
- Agent sends machine hostname; gateway adds it as DNS SAN alongside
  the UUID SAN (dual names for future direct connectivity)
- Agent connects using real gateway hostname instead of dummy value
- Move sha2/hex to cross-platform deps, add x509-parser + hostname

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Remove agent_name from EnrollResponse (agent knows it already)
- Agent generates its own UUID and sends it in EnrollRequest
- Rename api/agent_enrollment.rs → api/tunnel.rs (match endpoint)
- Use backoff crate for reconnect loop (same pattern as subscriber.rs)
- ALPN: "devolutions-agent-tunnel" → "gw-agent-tunnel/1" (versioned)
- Protocol version: 2 → 1 (previous was experimental, start fresh)
- Move session tests to integration test file (public API only)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- SanType::Rfc822Name → SanType::URI for urn:uuid: (correct X.509 type)
- GeneralName::RFC822Name → GeneralName::URI in extraction
- Reject duplicate agent UUID on enrollment (409 Conflict)
- tokio::join! instead of select! for session proxy (prevents data loss)
- JoinSet instead of Vec<JoinHandle> (prevents unbounded growth)
- Timeout (30s) on session handshake recv_request/recv_response
- Fix typos: "redeemd" → "redeemed"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Move current_time_millis() to agent-tunnel-proto (R1: eliminate duplication)
- Delete DomainInfo, use DomainAdvertisement directly in AgentInfo (R2)
- Merge enroll_agent/bootstrap_and_persist into single function (I1)
- Agent task_handles: Vec<JoinHandle> → JoinSet with reaping (I4)
- Same-epoch route refresh: mutate updated_at in place, no clone (I5)
- Add #[must_use] on enrollment_store::redeem() (I6)
- connect_via_agent: cleaner error extraction with if-let (I3)
- Add TODO for active_stream_count tracking (I2)
- SECS_PER_DAY constant replaces magic 86400 (P4)
- Consistent .context() for ProtoError instead of map_err (P7)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread devolutions-gateway/src/api/mod.rs Outdated
Copilot AI left a comment

Pull request overview

Copilot reviewed 38 out of 39 changed files in this pull request and generated 4 comments.


Comment thread crates/agent-tunnel/src/cert.rs
Comment thread crates/agent-tunnel/src/listener.rs
Comment thread devolutions-agent/src/tunnel.rs
Comment thread crates/agent-tunnel-proto/src/stream.rs Outdated
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Hoist protocol version validation before match in both gateway and
  agent control loops (single check, no per-variant boilerplate)
- Validate ConnectResponse protocol version in connect_via_agent
- ServerCertStatus enum for ensure_server_cert (expiry + hostname SAN)
- send.finish() after proxy copy (graceful QUIC EOF)
- Fix constant_time_eq doc (inaccurate timing claim)
- Extract ALPN to agent_tunnel_proto::ALPN_PROTOCOL constant
- Destructure EnrollResponse at parameter level for readability
- ValidatedTunnelConf: make wrong state unrepresentable at type level
  (dto::TunnelConf for JSON, TunnelConf for runtime with non-optional fields)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread devolutions-agent/src/main.rs Outdated
Comment on lines +158 to +175
fn parse_enrollment_string(value: &str) -> Result<EnrollmentStringPayload> {
    const PREFIX: &str = "dgw-enroll:v1:";

    let encoded = value.strip_prefix(PREFIX).context("invalid enrollment string prefix")?;

    let decoded = base64::engine::general_purpose::URL_SAFE_NO_PAD
        .decode(encoded)
        .context("invalid base64 enrollment string")?;

    let payload: EnrollmentStringPayload =
        serde_json::from_slice(&decoded).context("invalid enrollment string payload")?;

    if payload.version != 1 {
        bail!("unsupported enrollment string version: {}", payload.version);
    }

    Ok(payload)
}
Member

suggestion: Switch to JWT instead of a custom format

Member

suggestion: Use the same approach as jmux-proto. Do not use serde and bincode.

Comment thread devolutions-gateway/Cargo.toml Outdated
base64 = "0.22"
bincode = "1.3"
ipnetwork = "0.20"
dashmap = "6.1"
Member

question: Do we really need dashmap?

Member

suggestion: I see a lot of new dependencies. Maybe reevaluate which dependencies are absolutely necessary and which could be removed. I see we pull in multiple libraries to parse PEM files… Pretty sure we already had something before pem and rustls-pem.

Member

suggestion: Extract more logic into a separate crate, the same way we did for the network scanner. agent-tunnel-proto (already existing) + agent-tunnel.

- `dashmap::DashMap` → `tokio::sync::RwLock<HashMap>` in
  `enrollment_store`, `listener`, `registry`. All lookups/inserts await
  the lock; values are cloned out (Arc/quinn::Connection) so no guards
  escape the critical section.
- `pem` crate → `rustls_pemfile::certs` via a small `cert_pem_to_der`
  helper in `cert.rs`. The CSR tamper test now uses `base64` directly
  for PEM encode/decode.
- `bincode` + `serde` → hand-rolled binary encoding in
  `agent-tunnel-proto`, following the `jmux-proto` pattern:
  * `FramedSend<S>` / `FramedRecv<R>` handle length-prefixed framing
    and encode/decode via private `Encode` / `Decode` traits.
  * `ControlStream` / `SessionStream` compose `FramedSend` +
    `FramedRecv` with their respective max frame sizes; no more free
    `write_framed` / `read_framed` helpers.
  * Each `ControlMessage` / `ConnectRequest` / `ConnectResponse`
    variant has an explicit wire layout with tag bytes, big-endian
    integers, u32-length-prefixed strings, and explicit IPv4 framing.
  * `serde` becomes an optional feature on the proto crate, enabled by
    `devolutions-gateway` for its JSON API (`DomainAdvertisement`
    serialization); `devolutions-agent` drops it entirely.

All 18 proto tests (roundtrip + proptest) pass unchanged.
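
For readers unfamiliar with the `jmux-proto` style, the shape described above (tag byte, big-endian integers, u32-length-prefixed strings, length-prefixed frames with a size cap) can be sketched roughly as follows. This is a std-only illustration, not the crate's actual code: `Message`, its variants, the tag values, `DecodeError`, `encode_frame`, `decode_frame`, and the u32 frame prefix are all assumptions made for the example.

```rust
// Hypothetical sketch: not the real agent-tunnel-proto wire layout.
#[derive(Debug, PartialEq)]
pub enum DecodeError { UnexpectedEof, UnknownTag(u8), InvalidUtf8, FrameTooLarge(usize) }

/// Illustrative message type; the PR's `ControlMessage` variants differ.
#[derive(Debug, PartialEq)]
pub enum Message {
    Heartbeat { timestamp_ms: u64 },
    Hello { protocol_version: u16, agent_name: String },
}

const TAG_HEARTBEAT: u8 = 0x01;
const TAG_HELLO: u8 = 0x02;

pub trait Encode { fn encode(&self, out: &mut Vec<u8>); }
pub trait Decode: Sized { fn decode(input: &mut &[u8]) -> Result<Self, DecodeError>; }

/// Split `n` bytes off the front of the cursor, failing instead of panicking.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Result<&'a [u8], DecodeError> {
    let data: &'a [u8] = *input;
    if data.len() < n {
        return Err(DecodeError::UnexpectedEof);
    }
    let (head, tail) = data.split_at(n);
    *input = tail;
    Ok(head)
}

fn read_u32(input: &mut &[u8]) -> Result<u32, DecodeError> {
    let b = take(input, 4)?;
    Ok(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

impl Encode for Message {
    fn encode(&self, out: &mut Vec<u8>) {
        match self {
            Message::Heartbeat { timestamp_ms } => {
                out.push(TAG_HEARTBEAT);
                out.extend_from_slice(&timestamp_ms.to_be_bytes());
            }
            Message::Hello { protocol_version, agent_name } => {
                out.push(TAG_HELLO);
                out.extend_from_slice(&protocol_version.to_be_bytes());
                // Strings: u32 big-endian byte length, then the UTF-8 bytes.
                out.extend_from_slice(&(agent_name.len() as u32).to_be_bytes());
                out.extend_from_slice(agent_name.as_bytes());
            }
        }
    }
}

impl Decode for Message {
    fn decode(input: &mut &[u8]) -> Result<Self, DecodeError> {
        let tag = take(input, 1)?[0];
        match tag {
            TAG_HEARTBEAT => {
                let mut raw = [0u8; 8];
                raw.copy_from_slice(take(input, 8)?);
                Ok(Message::Heartbeat { timestamp_ms: u64::from_be_bytes(raw) })
            }
            TAG_HELLO => {
                let v = take(input, 2)?;
                let protocol_version = u16::from_be_bytes([v[0], v[1]]);
                let len = read_u32(input)? as usize;
                let name = take(input, len)?.to_vec();
                let agent_name = String::from_utf8(name).map_err(|_| DecodeError::InvalidUtf8)?;
                Ok(Message::Hello { protocol_version, agent_name })
            }
            other => Err(DecodeError::UnknownTag(other)),
        }
    }
}

/// Prefix the encoded body with a u32 big-endian length (assumed prefix width).
pub fn encode_frame<T: Encode>(msg: &T) -> Vec<u8> {
    let mut body = Vec::new();
    msg.encode(&mut body);
    let mut frame = (body.len() as u32).to_be_bytes().to_vec();
    frame.extend_from_slice(&body);
    frame
}

/// Reject oversized frames before reading the body, then decode it.
pub fn decode_frame<T: Decode>(input: &mut &[u8], max_frame: usize) -> Result<T, DecodeError> {
    let len = read_u32(input)? as usize;
    if len > max_frame {
        return Err(DecodeError::FrameTooLarge(len));
    }
    let mut body = take(input, len)?;
    T::decode(&mut body)
}
```

Checking the declared length against the cap before slicing the body is what keeps a hostile peer from forcing a large allocation; the real `ControlStream` / `SessionStream` wrappers presumably apply their own per-stream limits in the same spirit.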
Addresses Benoit's review comment: "Switch to JWT instead of a custom
format". The old `dgw-enroll:v1:<base64-JSON>` envelope is replaced with
a standard JWT that carries the same information via JWT claims and
doubles as the Bearer token for `/jet/tunnel/enroll`.

Gateway:
- Add `AccessScope::TunnelEnroll` and a dedicated `EnrollmentTokenClaims`
  struct with `scope`, `exp`, `jti`, `jet_gw_url` (required) and
  `jet_agent_name` (optional). The `jet_*` prefix matches the existing
  convention for gateway-specific custom claims (`jet_aid`, `jet_ap`,
  `jet_gw_id`, ...).
- Add `validate_enrollment_jwt` in `api/tunnel.rs` (ported from feature
  branch). Verifies signature against `provisioner_public_key`, checks
  `exp`/`nbf` via picky's strict validator, and enforces scope is
  `TunnelEnroll` or `Wildcard`.
- `enroll_agent` now tries JWT first, then the one-time token store,
  then the static `enrollment_secret` as a fallback.
- 7 unit tests cover the happy path, wildcard scope, wrong scope,
  expiry, signature mismatch, missing required claim, and malformed
  input.

Agent:
- Replace `EnrollmentStringPayload` / `parse_enrollment_string` with
  `EnrollmentJwtClaims` / `parse_enrollment_jwt`. The parser splits on
  `.` and decodes the payload segment without verifying the signature
  (agent is the intended recipient; the Gateway verifies on enrollment).
- The JWT string itself becomes the Bearer token — no more separate
  `enrollment_token` field nested inside a custom envelope.
- 3 tests: happy path via `parse_up_command_args`, malformed rejection,
  and missing-`jet_gw_url` rejection.

Also fixes the pre-existing inline registry tests that broke in the
previous commit when `DashMap` → `tokio::sync::RwLock<HashMap>` made
`AgentRegistry` methods async and `DomainAdvertisement.domain` became
a `DomainName` newtype.
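
The agent-side parsing described above (split on `.`, decode the payload segment, no signature check) might look roughly like this. It is a sketch under assumptions: `unverified_jwt_payload`, `JwtParseError`, and `base64url_decode` are hypothetical names, the base64url decoder is hand-written here only to keep the example std-only, and the step that parses the JSON claims (such as `jet_gw_url`) is left out.

```rust
// Hypothetical sketch; names and error type are not taken from the PR.
#[derive(Debug, PartialEq)]
pub enum JwtParseError {
    NotThreeSegments,
    InvalidBase64,
    InvalidUtf8,
}

/// Decode unpadded base64url (RFC 4648, section 5). Returns `None` on any
/// character outside the URL-safe alphabet or on an impossible length.
fn base64url_decode(input: &str) -> Option<Vec<u8>> {
    if input.len() % 4 == 1 {
        return None; // a single dangling character cannot encode a byte
    }
    let mut out = Vec::with_capacity(input.len() * 3 / 4);
    let mut acc: u32 = 0;
    let mut bits: u32 = 0;
    for byte in input.bytes() {
        let value = match byte {
            b'A'..=b'Z' => byte - b'A',
            b'a'..=b'z' => byte - b'a' + 26,
            b'0'..=b'9' => byte - b'0' + 52,
            b'-' => 62,
            b'_' => 63,
            _ => return None,
        };
        // Accumulate 6 bits at a time and emit a byte whenever 8 are available.
        acc = (acc << 6) | u32::from(value);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1 << bits) - 1;
        }
    }
    Some(out)
}

/// Return the JSON text of the claims segment WITHOUT verifying the signature.
/// Only suitable where the caller is the token's recipient and the issuer
/// verifies the token later, as the commit message describes.
pub fn unverified_jwt_payload(token: &str) -> Result<String, JwtParseError> {
    let mut parts = token.split('.');
    let payload = match (parts.next(), parts.next(), parts.next(), parts.next()) {
        (Some(_header), Some(payload), Some(_signature), None) => payload,
        _ => return Err(JwtParseError::NotThreeSegments),
    };
    let bytes = base64url_decode(payload).ok_or(JwtParseError::InvalidBase64)?;
    String::from_utf8(bytes).map_err(|_| JwtParseError::InvalidUtf8)
}
```

Nothing returned by a function like this should be trusted for authorization on the agent; it only tells the agent where to send the token, and the Gateway remains the party that validates signature, expiry, and scope.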
Addresses Benoit's review comment: "Extract more logic into a separate
crate, the same way we did for the network scanner. `agent-tunnel-proto`
(already existing) + `agent-tunnel`."

The agent tunnel module was already self-contained (zero `use crate::*`
imports), so the extraction is a mechanical move:

- Create `crates/agent-tunnel/` as a new workspace crate
- Move `cert.rs`, `enrollment_store.rs`, `listener.rs`, `registry.rs`,
  `stream.rs` from `devolutions-gateway/src/agent_tunnel/` (git tracks
  these as renames)
- New `lib.rs` does the `#[macro_use] extern crate tracing` dance and
  re-exports the public surface (`AgentTunnelHandle`,
  `AgentTunnelListener`, `AgentRegistry`, `EnrollmentTokenStore`,
  `TunnelStream`)
- Delete `devolutions-gateway/src/agent_tunnel/mod.rs`
- Gateway now depends on `agent-tunnel` as a path dependency; call
  sites change `crate::agent_tunnel::*` → `agent_tunnel::*`

Also promote `Encode` / `Decode` in `agent-tunnel-proto::codec` from
`pub(crate)` to `pub` so `FramedSend::send` / `FramedRecv::recv` (which
bound on them) are reachable in the new crate without `private_bounds`
warnings.

Tests: 20 moved from gateway inline into the new crate and all still
pass; gateway still has 64 lib tests + all integration tests green;
agent + proto tests untouched.
Review-agent findings addressed:

- Drop `ControlMessage`/`ConnectRequest`/`ConnectResponse` inherent
  `encode`/`decode` methods. They duplicated the `Encode`/`Decode`
  trait impls with identical signatures, so callsites and rustdoc saw
  two methods for one job. Only the trait impls remain; stream wrappers
  already go through the traits.
- `RouteAdvertisementState::update_routes` same-epoch branch now logs
  the *incoming* subnet/domain counts (previously re-logged the
  existing state's count, which read as if we had accepted the new
  set) and makes it explicit in the message that incoming routes are
  ignored.
- Rename `constant_time_eq` → `timing_safe_eq`. The function hashes
  inputs first and only the 32-byte digest compare is constant-time.
  New name describes intent; doc comment now explains both what the
  hash normalization buys and what the function does *not* guarantee.
- Document that `EnrollmentTokenStore::redeem` removes expired tokens
  as a side effect (so callers cannot distinguish "missing" from
  "expired", and shouldn't).
- Explain in `parse_enrollment_jwt` why we hand-roll the split/decode
  instead of pulling `picky` into the agent for unverified payload
  reading.
- Move `use agent_tunnel_proto::current_time_millis;` to the top of
  `registry.rs` with the other imports (was dangling at module bottom
  after the IPv4-only revert).
- Apply `cargo fmt`.

Tests: 20 agent-tunnel + 13 agent-tunnel-proto + 5 session_roundtrip
+ 64 gateway lib + 5 devolutions-agent, all green. Zero clippy
warnings on the changed crates.
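
The documented `redeem` behaviour (the entry is removed whether or not it is still valid, so "missing" and "expired" look the same to the caller) can be illustrated with a small std-only sketch. `TokenStore`, its field, and the explicit `now` parameter are assumptions for the example; the real `EnrollmentTokenStore` sits behind an async lock and may differ in every detail.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `EnrollmentTokenStore`; not the PR's type.
pub struct TokenStore {
    expires_at_by_token: HashMap<String, Instant>,
}

impl TokenStore {
    pub fn new() -> Self {
        TokenStore { expires_at_by_token: HashMap::new() }
    }

    /// Register a one-time token that is valid for `ttl` starting at `now`.
    pub fn insert(&mut self, token: &str, ttl: Duration, now: Instant) {
        self.expires_at_by_token.insert(token.to_string(), now + ttl);
    }

    /// Consume a token. Returns `true` only if it existed and had not expired.
    /// The entry is removed in every case, so an expired token is cleaned up
    /// as a side effect and is indistinguishable from one that never existed.
    #[must_use]
    pub fn redeem(&mut self, token: &str, now: Instant) -> bool {
        match self.expires_at_by_token.remove(token) {
            Some(expires_at) => now < expires_at,
            None => false,
        }
    }

    /// Number of tokens still stored (valid or not yet reaped).
    pub fn remaining(&self) -> usize {
        self.expires_at_by_token.len()
    }
}
```

Passing `now` in explicitly is a testing convenience of this sketch rather than something the PR states; it lets expiry be exercised without sleeping.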
- Drop the 1-byte IP family tag from each subnet on the wire. The type
  is `Ipv4Network` so the tag could only ever be `0x04`. Encoding it
  was a TODO-by-bytes that would have constrained a future v2 without
  helping v1. Each subnet is now `[4B ipv4_octets][1B prefix]` — saves
  a byte per subnet per RouteAdvertise. If IPv6 arrives, the wire bump
  comes with a `protocol_version` bump and the format can reintroduce
  a tag cleanly.
- Add six unit tests for `DomainName::matches_hostname` covering exact
  match, case insensitivity, suffix match, rejected partial-label
  ("fakecontoso.local" vs "contoso.local"), unrelated hostname, and
  parent vs child domain. The method is only called from PR2's routing
  code; these tests make sure the algorithm is locked down on PR1 so
  the PR2 consumer can rely on it.
- `devolutions-agent/src/tunnel.rs`: replace the `continue;` on
  backoff exhaustion with a fall-through using a 1s floor. Previously,
  if `backoff.next_backoff()` ever returned `None` (supposedly
  unreachable with `max_elapsed_time(None)`), the loop would spin
  without any sleep. Defensive fix, not a correctness one.
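
As a rough illustration of the tagless `[4B ipv4_octets][1B prefix]` subnet layout from the first bullet, a std-only sketch follows. `Subnet`, `encode_subnet`, and `decode_subnet` are hypothetical names (the PR uses `Ipv4Network`), and rejecting prefixes above 32 is an assumption of this sketch, not something the commit message states.

```rust
use std::net::Ipv4Addr;

/// Hypothetical stand-in for `Ipv4Network`; not the PR's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Subnet {
    pub addr: Ipv4Addr,
    pub prefix: u8,
}

/// Fixed size of one subnet on the wire: four address octets plus the prefix.
pub const SUBNET_WIRE_LEN: usize = 5;

/// Append `[4B ipv4_octets][1B prefix]` to `out`; no address-family tag.
pub fn encode_subnet(subnet: &Subnet, out: &mut Vec<u8>) {
    out.extend_from_slice(&subnet.addr.octets());
    out.push(subnet.prefix);
}

/// Parse exactly one subnet. Returns `None` on a wrong length or on a
/// prefix longer than 32 bits (validation assumed for this sketch).
pub fn decode_subnet(bytes: &[u8]) -> Option<Subnet> {
    if bytes.len() != SUBNET_WIRE_LEN || bytes[4] > 32 {
        return None;
    }
    Some(Subnet {
        addr: Ipv4Addr::new(bytes[0], bytes[1], bytes[2], bytes[3]),
        prefix: bytes[4],
    })
}
```

Because every entry has the same fixed width, a decoder does not need a per-subnet tag to find entry boundaries, which is why dropping the tag costs nothing in v1.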

All 20 agent-tunnel / 19 agent-tunnel-proto / 64 gateway-lib / 5
session_roundtrip / 5 devolutions-agent tests still pass. Zero clippy
warnings on the changed crates.
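
The cases listed for `DomainName::matches_hostname` (exact match, case insensitivity, suffix match, rejected partial label) suggest label-boundary suffix matching. A minimal sketch of that idea, assuming ASCII case folding and no trailing-dot handling, could look like this; the struct body and constructor are hypothetical and the PR's actual algorithm may differ.

```rust
/// Hypothetical stand-in for the PR's `DomainName` newtype.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DomainName(String);

impl DomainName {
    /// Store the domain lowercased so later comparisons are case-insensitive.
    pub fn new(domain: &str) -> Self {
        DomainName(domain.to_ascii_lowercase())
    }

    /// True if `hostname` equals the domain or ends with `.<domain>`.
    pub fn matches_hostname(&self, hostname: &str) -> bool {
        let host = hostname.to_ascii_lowercase();
        if host == self.0 {
            return true;
        }
        // A plain suffix check would accept "fakecontoso.local" for
        // "contoso.local"; requiring a '.' right before the suffix
        // restricts matches to whole DNS labels.
        match host.strip_suffix(self.0.as_str()) {
            Some(prefix) => prefix.ends_with('.'),
            None => false,
        }
    }
}
```

With this shape, a hostname that is only the parent of the advertised domain does not match, since it is shorter than the domain and cannot carry it as a suffix.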
3 participants